Verified Software Toolchain
Abstract
The software toolchain includes static analyzers to check assertions about programs, optimizing compilers to translate programs to machine language, and operating systems and libraries to supply context for programs. Our Verified Software Toolchain verifies with machine-checked proofs that the assertions claimed at the top of the toolchain really hold in the machine-language program, running in the operating-system context, on a weakly-consistent-shared-memory machine. Our verification approach is modular, in that proofs about operating systems or concurrency libraries are oblivious of the programming language or machine language, proofs about compilers are oblivious of the program logic used to verify static analyzers, and so on. The approach is scalable, in that each component is verified in the semantic idiom most natural for that component. Finally, the verification is foundational: the trusted base for proofs of observable properties of the machine-language program includes only the operational semantics of the machine language, not the source language, the compiler, the program logic, or any other part of the toolchain, even when these proofs are carried out by source-level static analyzers. In this paper I explain some semantic techniques for building a verified toolchain.

Consider a software toolchain comprising: a static analyzer or program verifier that uses program invariants to check assertions about the behavior of the source program; a compiler that translates the source-language program to a machine-language (or other object-language) program; and a runtime system (or operating system, or concurrency library) that serves as the runtime context for external function calls of the machine-language program. We want to construct a machine-checked proof, from the foundations of logic, that any claims by the static analyzer about observations of the source-language program will also characterize the observations of the compiled program.
We may want to attach several different static analyzers to the same compiler, or choose among several compilers beneath the same analyzer, or substitute one operating system for another. The construction and verification of just one component, such as a compiler or a static analyzer, may be as large as one project team or research group can reasonably accomplish. For both of these reasons, the interfaces between components (their specifications) deserve as much attention and effort as the components themselves.

We specify the observable behavior of a concurrent program (or of a single thread of that program) as its input-output behavior, so that the statement "Program p matches specification S" can be expressed independently of the semantics of the programming language in which p is written, or of the machine language. Of course, those semantics will show up in the proof of the claim! Sections 11 and 12 explain this in more detail. But observable "input-output" behaviors of individual shared-memory threads are not just the input and output of atomic tokens: a lock-release makes visible (all at once) a whole batch of newly observable memory, a lock-acquire absorbs a similar batch, and an operating-system call (read into memory buffer, write from memory buffer, alloc memory buffer) is also an operation on a set of memory locations. Section 2 explains.

The main technical idea of this approach is to define a thread-local semantics for a well-synchronized concurrent program and use this semantics to trick the correctness proof of the sequential compiler to prove its correctness on a concurrent program.
That is, take a sequential program; characterize its observable behavior in a way that ignores intensional properties such as individual loads and stores, and focuses instead on (externally visible) system calls or lock-acquire/releases that transfer a whole batch of memory locations; split into thread-local semantics of individual threads; carry them through the modified sequential proof of compiler correctness; gather the resulting statements about observable interactions of individual threads into a statement about the observable behavior of the whole binary.

Because our very expressive program logic is most naturally proved sound with respect to an operational semantics with fancy features (permissions, predicates-in-the-heap) that don't exist in a standard operational semantics (or real computer), we need to erase them at some appropriate point. But when to erase? We do some erasure (what we call the transition from decorated operational semantics to angelic operational semantics) before the compiler-correctness proof; other erasure (from angelic to erased) should be done much later.

We organize our proof, in Coq, as follows. (Items marked • are either completed or nearly so; items marked ◦ are in the early stages; items marked – are plausible but not even begun. My principal coauthors on this research, in rough chronological order, are Sandrine Blazy, Aquinas Hobor, Robert Dockins, Lennart Beringer, and Gordon Stewart.)
• We specify an expressive program logic for source-language programs;
◦ we instrument the static analyzer to emit witnesses in the form of invariants;
◦ we reimplement just the core of the static analyzer (invariant checker, not invariant inference engine) and prove it correct w.r.t. the program logic;
• we specify a decorated operational semantics for source-language programs;
• we prove the soundness of the program logic w.r.t. the decorated semantics;
• we specify an angelic operational semantics;
• we prove a correspondence between executions of the decorated and the angelic semantics;
? we prove the correctness of the optimizing compiler w.r.t. the angelic operational semantics of the source and machine languages;
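The notion of observable behavior described above (ignore individual loads and stores, keep only the events that transfer a whole batch of memory locations) can be illustrated with a small sketch. This is not the paper's Coq formalization; every name here (`Store`, `Release`, `Syscall`, `observable_behavior`, `run_thread`) is hypothetical, and the model assumes a single thread whose only externally visible steps are lock releases and system calls.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple, Union

# Hypothetical event model, for illustration only.
Batch = Tuple[Tuple[int, int], ...]  # (address, value) pairs transferred all at once


@dataclass(frozen=True)
class Store:
    """Intensional step: a single store, not externally observable."""
    addr: int
    value: int


@dataclass(frozen=True)
class Release:
    """Observable step: a lock release publishes a whole batch of memory."""
    lock: str
    batch: Batch


@dataclass(frozen=True)
class Syscall:
    """Observable step: a system call operating on a set of memory locations."""
    name: str
    batch: Batch


Step = Union[Store, Release, Syscall]
OBSERVABLE = (Release, Syscall)


def observable_behavior(steps: Iterable[Step]) -> List[Step]:
    """Project an execution onto its externally visible events."""
    return [s for s in steps if isinstance(s, OBSERVABLE)]


def run_thread(stores: List[Tuple[int, int]], lock: str, addrs: List[int]) -> List[Step]:
    """Perform the stores in thread-private memory, then release `lock`,
    publishing the current contents of `addrs` as one batch."""
    memory: Dict[int, int] = {}
    steps: List[Step] = []
    for addr, value in stores:
        memory[addr] = value
        steps.append(Store(addr, value))
    batch = tuple((a, memory[a]) for a in sorted(addrs))
    steps.append(Release(lock, batch))
    return steps


# "Source" execution: contains a dead store to address 0.
source_steps = run_thread([(0, 1), (0, 7), (1, 9)], "l", [0, 1])
# "Compiled" execution: dead store removed, remaining stores reordered.
compiled_steps = run_thread([(1, 9), (0, 7)], "l", [0, 1])
```

In this toy model the two executions differ step by step, yet `observable_behavior` returns the same single `Release` event for both, which is the sense in which a specification stated over observable events need not mention the loads and stores of either language.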
Related resources
Verified Correctness and Security of OpenSSL HMAC
We have proved, with machine-checked proofs in Coq, that an OpenSSL implementation of HMAC with SHA256 correctly implements its FIPS functional specification and that its functional specification guarantees the expected cryptographic properties. This is the first machine-checked cryptographic proof that combines a source-program implementation proof, a compiler-correctness proof, and a cryptogra...
Program Logics for Certified Compilers
Separation logic is the twenty-first-century variant of Hoare logic that permits verification of pointer-manipulating programs. This book covers practical and theoretical aspects of separation logic at a level accessible to beginning graduate students interested in software verification. On the practical side it offers an introduction to verification in Hoare and separation logics, simple case ...
The Click2NetFPGA Toolchain
High Level Synthesis (HLS) is a promising technology where algorithms described in high level languages are automatically transformed into a hardware design. Although many HLS tools exist, they are mainly targeting developers who want to use a high level programming language to design hardware modules. They are not designed to automatically compile a complete software system, such as a network ...
VST-Flow: Fine-grained low-level reasoning about real-world C code
We show how support for information-flow security proofs could be added on top of the Verified Software Toolchain (VST). We discuss several attempts to define information flow security in a VST-compatible way, and present a statement of information flow security in "continuation-passing" style. Moreover, we present Hoare rules augmented with information flow control assertions, and sketch how th...
A Second Edition: Verification of a Cryptographic Primitive: SHA-256
The first edition of this paper appeared in TOPLAS 37(2) 7:1-7:31 (April 2015). It used notation compatible with the Verified Software Toolchain version 1.0, now obsolete. In this second edition there are no new scientific results, but the Verifiable C notation used corresponds to the VST 1.6 software currently in use, January 2016. Any differences between this version and the as-published TOPL...
The Ramifications of Mechanized Localizations within Data Structures
We develop a way to mechanically verify realistic programs that manipulate data structures with intrinsic sharing such as heap-represented graphs. We upgrade Hobor and Villard's theory of ramification to better support modified program variables and existential quantifiers in assertions. We develop a modular and general setup for reasoning about mathematical graphs and show how to connect this s...
Publication date: 2012